ABSTRACT

An interactive package of computer programs has been developed for
the analysis of time series data. The package, called the Time Series
Editor, is designed around the Box-Jenkins statistical methodology of
time series analysis. The Time Series Editor was developed for
time-shared use on the Controlled Program/Cambridge Monitor System
(CP/CMS) but could be easily modified to accommodate other
time-sharing systems. The Time Series Editor assists in data
preparation, entry, analysis and diagnostic testing. Use of the
package requires only a limited knowledge of the computer system,
since all required user responses are prompted by the Editor.
I. INTRODUCTION

Operations researchers, statisticians, economists, marketing
personnel, managers and many others are frequently faced
with the need to analyze time series data. In most cases
their objective is to discover patterns or recognizable
behavior in historical data that can be used to construct
mathematical models of the time series from which forecasts of
future behavior can be obtained. The importance of being able
to forecast future behavior accurately cannot be overemphasized.
Whether the data be budget expenditures, populations,
natural resource consumption, prices, demands, economic
indicators, stock-market prices, manpower levels or whatever,
decision makers concerned with planning for the future must
base their decisions on their best predictions about the
future behavior of the time series.

Until the late 1960's, the analysis techniques were primarily
those of spectral analysis with heavy application of
harmonic analysis and mathematical transform theory. Because
of the mathematical sophistication required by the spectral
approach, the analysis capability resided fairly exclusively
in the hands of mathematicians and engineers. Consequently,
many naive forecasting methods such as moving averages,
exponential smoothing and decomposition analyses were adopted
by the majority of the decision makers. Since the late 1960's
the statistical analysis of time series, embodied primarily
in the methodology developed by Box and Jenkins [Ref. 2],
has received widespread acceptance. Because the Box-Jenkins
approach is described in a vocabulary more familiar to
operations researchers, statisticians, economists and managers,
more and more business and government decision makers are
building models from past data to use for planning into the
future.

Many algorithms and computer programs for performing the
analyses required by the Box-Jenkins approach have been
developed and are readily available from many sources. One of
the best sources is the collection of FORTRAN computer
subroutines which resides in the International Mathematical and
Statistical Library (IMSL) [Ref. 4]. The major problem with
using the available computer resources lies not with any
deficiency of the algorithms or the programs, but with the very
nature of the Box-Jenkins approach. The Box-Jenkins method
is an iterative approach which is described in Figure 1.

[See Wheelwright and Makridakis, Ref. 7.]

Figure 1. Box-Jenkins forecasting method (diagram of Stage 1,
Stage 2 and Stage 3; not reproduced here).
Figure 1 shows that the Box-Jenkins method is a multi-stage,
iterative process. It begins with the postulation of a
general class of models which has been found, experimentally,
to be extremely rich. Thereafter, the modeling procedure
continues as a trial-and-error process with several decision
points where the analyst is required to select the next
direction based on the best information available to him.

Each stage of the process outlined in Figure 1 may consist 
of several steps, and, even with the existing computer soft- 
ware resources, the modeling process is usually very time 
consuming. For example, a typical Box-Jenkins time series 
analyst using a batch processing computer system with access 
to the IMSL library of subroutines might perform the follow- 
ing sequence of tasks: 

1. Prepare time series data.

2. Plot and visually examine the time series looking for
nonstationarity, trends, deterministic patterns, etc.

3. Write a program to call the IMSL subroutine that
calculates the mean, the variance, the autocorrelations
and the partial autocorrelations.

4. Plot the autocorrelations and partial autocorrelations.
This provides the major information needed for
identification of the time series.

5. Write a program to call the IMSL subroutine which
transforms the time series to adjust for seasonal
patterns, nonstationary behavior or other behavior
which deviates from that assumed by the class of
models postulated.

6. Repeat steps 2) through 4) using the transformed
data.

7. Review the statistical properties of the autocorrelations
and partial autocorrelations for tentative
identification of the model.

8. Write a program to call the IMSL program that estimates
the model parameters and computes the
residuals.

9. Write a program to perform goodness of fit tests.

10. Analyze the residuals following steps 1) through 9)
just as was done with the original time series.

11. Refine the model using the information obtained by
the analysis of the residuals.

12. Repeat steps 8) through 10).

13. When an adequate model is obtained, write a program
to call the IMSL subroutine which forecasts and
determines confidence intervals for future values
of the time series.

Between each pair of steps the user must manually intervene 
and make a subjective decision based on the information 
available. Thus, a great deal of user interaction is required 
in order to determine a mathematical model and a forecast 
equation. Even with rapid computer turnaround time, the 
process can easily consume a day or more of calendar time. 

This report describes an effort to alleviate some of the
problems involved with modeling time series using the
Box-Jenkins approach. An interactive computer package which
provides easy user access to the computational subroutines
available in IMSL and similar subroutine libraries
was developed. The package, called the Time Series Editor,
was written for time-shared use on the Naval Postgraduate
School's Controlled Program/Cambridge Monitor System (CP/CMS).
Since all programs except the executive routine are
written in FORTRAN, the Time Series Editor could be easily
modified to accommodate other time-sharing systems. The
Time Series Editor assists the user in data preparation,
entry, model construction, diagnostic testing and forecasting.
Use of the package requires only a limited
knowledge of CP/CMS. In fact, with the User's Guide provided
in this report, a complete Box-Jenkins time series
analysis can be performed in a short time by even a naive
computer user. For this reason, the Time Series Editor
should be valuable as an instructional aid for laboratory
use in a time series class.

A brief description of the Box-Jenkins methodology is 
given in Chapter 2 to serve as a point of reference for the 
remaining material. Chapter 3 contains descriptions of 
each of the programs contained in the Time Series Editor. 
The use of the Time Series Editor is illustrated with an 
example time series which is given in Chapter 4. 

Chapter 5 contains a summary and recommendations for 
additions to the Time Series Editor. A User's Guide which 
includes an explanation of CP/CMS sufficient for utiliza- 
tion of the Time Series Editor is given in Appendix A. 
Sample user sessions, sample outputs and complete computer 
listings are also included in appendices. 
II. BOX-JENKINS METHODOLOGY

In this chapter a brief description of the Box-Jenkins
time series modeling methodology is given. For a more
detailed discussion the reader is referred to the texts by
Anderson [Ref. 1], Box and Jenkins [Ref. 2], Pindyck and
Rubinfeld [Ref. 6], and Nelson [Ref. 5]. The material
presented here is included primarily to serve as a point of
reference for the program descriptions that follow in later
chapters. It is also included here to aid the user in
understanding the computer output and the questions asked
by the Time Series Editor.

A. PROPERTIES OF STATIONARY PROCESSES 

A discrete stochastic time series is a set of observations
$y_1, y_2, \ldots, y_T$ generated sequentially in time by a
set of jointly distributed random variables; i.e., the data
$y_1, \ldots, y_T$ represent a particular realization of a joint
probability distribution $f(y_1, y_2, \ldots, y_T)$. A future
observation, $y_{T+1}$, can be thought of as being generated by a
conditional probability distribution function
$f(y_{T+1} \mid y_1, \ldots, y_T)$
given the realization through time T. The stochastic process
which generates the time series is said to be stationary
if its properties are unaffected by a change of time
origin; that is, if the joint probability distribution
associated with m observations $y_{t_1}, y_{t_2}, \ldots, y_{t_m}$,
made at any set of times $t_1, t_2, \ldots, t_m$, is the same as
that distribution associated with m observations
$y_{t_1+k}, \ldots, y_{t_m+k}$ made at times
$t_1+k, \ldots, t_m+k$.

If a process is stationary, the probability distribution
$f(y_t)$ is the same for all times t. Thus the process has a
constant mean

$$\mu = E[y_t] = \int_{-\infty}^{\infty} y\, f(y)\, dy$$

which defines the level about which it fluctuates, and a
constant variance

$$\sigma_y^2 = E[(y_t - \mu)^2] = \int_{-\infty}^{\infty} (y - \mu)^2 f(y)\, dy$$

which measures its variability about the mean level. Since
the probability distribution $f(y_t)$ is the same for all times
t, the mean and the variance can be estimated by the averages
taken over time:

$$\bar{y} = \frac{1}{N} \sum_{t=1}^{N} y_t$$

and

$$\hat{\sigma}_y^2 = \frac{1}{N} \sum_{t=1}^{N} (y_t - \bar{y})^2 .$$

The stationarity assumption also implies that the bivariate
distribution $f(y_{t_1}, y_{t_2})$ is the same for all times
$t_1$ and $t_2$ such that $t_2 - t_1$ is constant. The
autocorrelation at lag k, $\rho_k$, is defined as:

$$\rho_k = \frac{E[(y_t - \mu)(y_{t+k} - \mu)]}{\sigma_y^2}$$

This is estimated by the time average:

$$r_k = \left( \frac{1}{N} \sum_{t=1}^{N-k} (y_t - \bar{y})(y_{t+k} - \bar{y}) \right) \Big/ \hat{\sigma}_y^2$$

The plot of the autocorrelation function vs. the lag k,
called the correlogram, is very useful for the purpose of
determining if a process is stationary and for identifying
the appropriate model.
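
The time-average estimates above can be sketched in a few lines of Python. This is a minimal illustration, not the Editor's own code: the function names are hypothetical, and the divisor N (rather than N-k) in the lag-k average is an assumption that follows the usual Box-Jenkins convention.

```python
def sample_mean(y):
    """Time average of the series."""
    return sum(y) / len(y)


def sample_variance(y):
    """Variance estimate with divisor N (assumed convention)."""
    m = sample_mean(y)
    return sum((v - m) ** 2 for v in y) / len(y)


def autocorrelation(y, k):
    """Sample autocorrelation at lag k, for 0 <= k < len(y)."""
    n = len(y)
    m = sample_mean(y)
    cov_k = sum((y[t] - m) * (y[t + k] - m) for t in range(n - k)) / n
    return cov_k / sample_variance(y)


def correlogram(y, max_lag):
    """Autocorrelations for lags 1 through max_lag."""
    return [autocorrelation(y, k) for k in range(1, max_lag + 1)]


# A short series that repeats every four observations.
series = [2.0, 4.0, 6.0, 4.0, 2.0, 4.0, 6.0, 4.0]
print(sample_mean(series))
print(sample_variance(series))
print(correlogram(series, 4))
```

For this artificial series the correlogram is largest at lag 4, the length of the repeating pattern, which is the kind of feature the analyst looks for when reading the plot.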

Another function which is important for purposes of
identifying the appropriate linear time series model is the
partial autocorrelation function. Let
$\hat{y}_t = b_1 y_{t+1} + b_2 y_{t+2} + \cdots + b_{k-1} y_{t+k-1}$,
where the b's are the least squares
estimates of the linear regression coefficients ($\beta$'s) in
the model

$$y_t = \beta_1 y_{t+1} + \beta_2 y_{t+2} + \cdots + \beta_{k-1} y_{t+k-1} + e_t .$$

Let $z_t$ be the residual of $y_t$ after removing the linear effect
of $y_{t+1}, \ldots, y_{t+k-1}$:

$$z_t = y_t - \hat{y}_t .$$

The partial autocorrelation of lag k, denoted $\phi_{kk}$, is defined
to be the simple correlation of lag k for the adjusted series
$z_t$.

B. AUTOREGRESSIVE-MOVING AVERAGE MODELS

The general class of models postulated by Box and Jenkins
for stationary time series is the class of linear models
defined by:

$$y_t = \phi_1 y_{t-1} + \cdots + \phi_p y_{t-p} + \theta_0 + a_t - \theta_1 a_{t-1} - \theta_2 a_{t-2} - \cdots - \theta_q a_{t-q} \qquad (1)$$
where $\{a_t\}$ is a sequence of observations from a white noise
process ($E[a_t] = 0$, $\mathrm{var}[a_t] = \sigma_a^2$ and
$E[a_t a_{t+k}] = 0$ for all $k > 0$) and
$\phi_1, \ldots, \phi_p, \theta_0, \theta_1, \ldots, \theta_q$ are
p+q+1 parameters that
are to be estimated from the data. The model above is
called a mixed autoregressive-moving average (ARMA) model of
order (p,q). If q=0, the model is called an autoregressive
model of order p, $AR_p$; if p=0, the model is called a moving
average model of order q, $MA_q$. Thus, the general linear
model of Box-Jenkins represents the current observation $y_t$
as a weighted sum of past observations and present and past
random shock terms. The model is usually expressed in
abbreviated form using operator and transfer function notation:

$$\phi(B)\, y_t = \theta_0 + \theta(B)\, a_t \qquad (2)$$

where $\phi(B) = 1 - \phi_1 B - \phi_2 B^2 - \cdots - \phi_p B^p$,
$\theta(B) = 1 - \theta_1 B - \cdots - \theta_q B^q$ and
B is the operator defined by $B^k y_t = y_{t-k}$.

In order to guarantee stationarity it is necessary that the
autoregressive parameters satisfy certain conditions. The
conditions can be summarized by stating that all roots of the
polynomial equation $\phi(B) = 0$ (treating B as a dummy variable)
must lie outside the unit circle.
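
The root condition can be checked without solving for the roots. The sketch below is an illustration only (the function name is hypothetical and the Editor is not stated to work this way): it uses the step-down form of the Durbin-Levinson recursion, a standard equivalent test in which the autoregressive operator is stationary exactly when every reflection coefficient produced by the recursion has magnitude less than one.

```python
def is_stationary(phi):
    """Return True if 1 - phi[0]*B - ... - phi[p-1]*B**p has all of its
    roots outside the unit circle.

    The coefficients are reduced one order at a time; the last
    coefficient at each order is the reflection coefficient, and the
    operator is stationary only if each of them is below one in
    magnitude.
    """
    coeffs = list(phi)
    while coeffs:
        k = coeffs[-1]                  # reflection coefficient
        if abs(k) >= 1.0:
            return False
        m = len(coeffs) - 1
        denom = 1.0 - k * k
        # Coefficients of the operator of order one less.
        coeffs = [(coeffs[j] + k * coeffs[m - 1 - j]) / denom
                  for j in range(m)]
    return True


print(is_stationary([0.5]))          # first-order operator 1 - 0.5B
print(is_stationary([0.5, -0.6]))    # 1 - 0.5B + 0.6B^2
print(is_stationary([1.5, -0.5]))    # (1 - B)(1 - 0.5B), a root on the unit circle
```

The last case factors as (1 - B)(1 - 0.5B), so it has a root at B = 1 and the check reports it as not stationary.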

The tentative identification of the appropriate member of
the general class (the identification of p and q) is
accomplished by comparing the sample autocorrelation and partial
autocorrelation functions of the given time series with the
theoretical autocorrelation and partial autocorrelation
functions of members of the general linear class. For most
stationary time series an adequate fit can be found in a
model with p and q relatively small, say three or less.



C. HOMOGENEOUS NONSTATIONARY SERIES 

In many cases the time series of interest is not stationary.
Instead, the probabilistic structure of the process
which generates the time series may change with time. For
example, there may be some sort of trend or seasonal pattern
in the time series. If the process does, nevertheless,
exhibit behavior which is somewhat homogeneous then the
original time series can often be transformed into a stationary
series that can be described by an ARMA model.

A time series is said to be homogeneous nonstationary of
order d if $w_t = \nabla^d y_t$ is a stationary series. Here
$\nabla$ denotes the differencing operator:

$$\nabla y_t = y_t - y_{t-1} = (1 - B)\, y_t$$

$$\nabla^d y_t = \nabla(\nabla^{d-1} y_t)$$

That a time series is nonstationary is indicated by a plot
of the time series itself (e.g. a nonconstant mean) and by
the autocorrelation function. Characteristic of the
correlogram of a nonstationary series is the very slow damping out
of the autocorrelation. When this property of the correlogram
is observed, the user should difference the series one
time and compute the correlogram for the series
$w_t = \nabla y_t$.
The user should continue to difference the series until the
resulting series appears stationary or until the procedure
appears not to improve the series. The transformed series
$\{w_t\}$ is then modeled as an ARMA(p,q) model. The resulting
model, in terms of the original time series is:

$$\phi(B)\, \nabla^d y_t = \theta_0 + \theta(B)\, a_t . \qquad (3)$$

In this form of the model, the transfer function $\phi(B)$ is
assumed to be stationary; i.e., all roots of $\phi(B) = 0$ are
outside the unit circle. It is sometimes written in the
equivalent form

$$\psi(B)\, y_t = \theta_0 + \theta(B)\, a_t \qquad (4)$$

where $\psi(B) = \phi(B)\nabla^d = \phi(B)(1 - B)^d$ and, clearly,
$\psi(B)$ is not a transfer function of a stationary series
($\psi(B) = 0$ has d roots on the unit circle). The ARMA model
for the differenced series is called an autoregressive integrated
moving average (ARIMA) model of order (p,d,q). For the purpose of
distinguishing between the two forms of the ARIMA model,
equation (3) is referred to as the differenced form and (4)
as the undifferenced form.
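
A small Python sketch of the differencing operator may help fix the idea. The function name and the two test series are illustrative assumptions, not part of the Editor.

```python
def difference(y, d=1):
    """Apply the differencing operator (1 - B) to the series d times.

    Each pass shortens the series by one observation.
    """
    w = list(y)
    for _ in range(d):
        w = [w[t] - w[t - 1] for t in range(1, len(w))]
    return w


trend = [3.0 + 2.0 * t for t in range(6)]        # linear trend
quadratic = [float(t * t) for t in range(6)]     # quadratic trend

print(difference(trend, 1))        # one difference removes a linear trend
print(difference(quadratic, 2))    # two differences remove a quadratic trend
```

In both cases the differenced series is constant, which is the simplest instance of a nonstationary series being reduced to a stationary one.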

D. SEASONAL TIME SERIES 

Seasonality is defined as cyclical behavior that occurs 
on a regular calendar basis. For example, a highly seasonal 
time series would be the sales of Christmas ornaments which 
exhibit a strong peak every December. Rainfall, crop yields, 
livestock production, energy consumption, and many other 
time series that are influenced by the weather all exhibit 
seasonal patterns. Seasonal patterns are often easy to spot
simply by observing the time series directly. However, many 
times, if the variability in the time series is large, 
seasonal patterns will not be distinguishable from the other 
fluctuations. Recognition of seasonality is important since 
it provides information that can aid in modeling and fore- 
casting. The autocorrelation function makes recognition of 
seasonal patterns easier. 

Suppose, for example, that a monthly series has an annual
seasonal pattern. Then the realizations should show some
special correlation with other realizations which lead or lag
by 12 months; i.e., there should be some correlation between
$y_t$ and $y_{t+12}$, $y_{t+24}$, $y_{t+36}$, etc. These
correlations should manifest themselves in the autocorrelation
function which should show peaks at k = 12, 24, 36, etc.

The Box-Jenkins modeling approach for seasonal (nonstationary)
time series is to first transform the seasonal series
to a new series which is stationary. This can often be
accomplished by taking seasonal differences $\nabla_s^d$ defined as
follows:

$$\nabla_s y_t = y_t - y_{t-s} = (1 - B^s)\, y_t$$

$$\nabla_s^d y_t = \nabla_s(\nabla_s^{d-1} y_t) .$$

The transformed time series $\{w_t\}$ ($w_t = \nabla_s^d y_t$) is then
analyzed as a stationary time series. (It may be necessary to
perform more than one seasonal and/or differencing transformation
before the resulting series is stationary.) Suppose that
the resulting ARMA(p,q) model for the transformed series
is

$$\phi(B)\, w_t = \theta_0 + \theta(B)\, a_t .$$

The model for the original series is then

$$\phi(B)\, (1 - B^s)^d\, y_t = \theta_0 + \theta(B)\, a_t .$$
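
The seasonal difference can be sketched the same way as the ordinary difference. The function name and the quarterly test pattern below are hypothetical and serve only as an illustration.

```python
def seasonal_difference(y, s, d=1):
    """Apply the seasonal operator (1 - B**s) to the series d times.

    Each pass drops the first s observations.
    """
    w = list(y)
    for _ in range(d):
        w = [w[t] - w[t - s] for t in range(s, len(w))]
    return w


# A repeating pattern of period 4 superimposed on a linear trend.
pattern = [10.0, 20.0, 30.0, 40.0]
series = [pattern[t % 4] + 0.5 * t for t in range(12)]

print(seasonal_difference(series, s=4))
```

One seasonal difference with s = 4 removes the repeating pattern and leaves only the constant contributed by the trend over one period.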

E. PARAMETER ESTIMATION 

Suppose the series has been tentatively identified as
an ARIMA(p,d,q) model:

$$\phi(B)\, \nabla^d y_t = \theta_0 + \theta(B)\, a_t \qquad (5)$$

where $\phi(B) = 1 - \phi_1 B - \cdots - \phi_p B^p$ and
$\theta(B) = 1 - \theta_1 B - \cdots - \theta_q B^q$. There
are p+q+2 unknown parameters,
$\phi_1, \ldots, \phi_p, \theta_0, \theta_1, \ldots, \theta_q, \sigma_a^2$,
that must be estimated. The Box-Jenkins procedure separates
the estimation problem into two parts. First, estimates are
obtained for the autoregressive-moving average parameters
$\boldsymbol{\phi}$ and $\boldsymbol{\theta}$, and then estimates
are made of $\sigma_a^2$ and $\theta_0$, which
are functions of the ARMA parameters. The usual procedure
is to select those parameter values $\hat{\boldsymbol{\phi}}$ and
$\hat{\boldsymbol{\theta}}$ that minimize the
sum of squared errors. Let
$\mu_w = \theta_0 / (1 - \phi_1 - \cdots - \phi_p)$ and
$w_t = \nabla^d y_t$. Then, it can be easily shown that
$\mu_w = E[w_t]$
and that (5) can be rewritten as:

$$\phi(B)\, (w_t - \mu_w) = \theta(B)\, a_t$$

or

$$a_t = \theta^{-1}(B)\, \phi(B)\, (w_t - \mu_w) .$$

Let $\bar{w}$, the sample mean, be the estimate of $\mu_w$ (if d > 0,
$\mu_w$ is usually 0) and let

$$\hat{a}_t = \hat{\theta}^{-1}(B)\, \hat{\phi}(B)\, (w_t - \bar{w}) . \qquad (6)$$

Let $S(\hat{\boldsymbol{\phi}}, \hat{\boldsymbol{\theta}}) = \sum_t \hat{a}_t^2$.
The objective is to select those parameters
$\hat{\boldsymbol{\phi}}$ and $\hat{\boldsymbol{\theta}}$ that
minimize $S(\hat{\boldsymbol{\phi}}, \hat{\boldsymbol{\theta}})$.
Since the equation S is nonlinear in the parameters,
iterative search methods must usually be used in the
minimization. An estimate of $\sigma_a^2$ is provided by

$$\hat{\sigma}_a^2 = S(\hat{\boldsymbol{\phi}}, \hat{\boldsymbol{\theta}}) / (n - p - q) .$$
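
The residuals in (6) and the sum of squares S can be computed recursively for any trial set of parameters. The Python sketch below is an illustration under stated assumptions: the function names are hypothetical, and pre-sample values of the shocks and of the mean-corrected series are set to zero, which is one common starting convention and not necessarily the one used by the IMSL routine.

```python
def residuals(w, phi, theta, mean):
    """Shock estimates from phi(B)(w_t - mean) = theta(B) a_t.

    Rearranged, a_t = z_t - sum(phi_i * z_{t-i}) + sum(theta_j * a_{t-j})
    with z_t = w_t - mean.  Pre-sample z and a are taken as zero
    (an assumption of this sketch).
    """
    z = [v - mean for v in w]
    a = []
    for t in range(len(z)):
        value = z[t]
        for i, p in enumerate(phi, start=1):
            if t - i >= 0:
                value -= p * z[t - i]
        for j, q in enumerate(theta, start=1):
            if t - j >= 0:
                value += q * a[t - j]
        a.append(value)
    return a


def sum_of_squares(w, phi, theta):
    """S(phi, theta), using the sample mean as the estimate of the level."""
    mean = sum(w) / len(w)
    return sum(a * a for a in residuals(w, phi, theta, mean))


def shock_variance(w, phi, theta):
    """Estimate of the shock variance, S / (n - p - q)."""
    return sum_of_squares(w, phi, theta) / (len(w) - len(phi) - len(theta))


# A series that decays by exactly one half each period is reproduced by
# a first-order autoregressive coefficient of 0.5 with a single shock.
print(residuals([1.0, 0.5, 0.25], [0.5], [], 0.0))
```

An iterative search would call sum_of_squares repeatedly with different trial values of phi and theta and keep the pair giving the smallest S.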

F. DIAGNOSTIC CHECKING 

After the model has been tentatively identified and parameter
estimates have been obtained, the next task is to test
whether or not the original specification was correct and
the model is adequate. The process of testing the model
takes many forms, but usually involves at least the following
two steps:

1. Generate a simulated series from the estimated model
and compare the simulated series and its autocorrelation
functions with the original series and its
respective autocorrelation and partial autocorrelation
functions. The comparison is primarily subjective.

2. Compute the residuals of the estimated model from
(6) and compare the properties of the residuals with
those assumed for the shock terms of the actual process.
The residuals should be normally distributed
and uncorrelated with each other. There are many
quantitative statistical tests that can be used to
test hypotheses about normality and zero correlation.

A plot of the autocorrelation and partial autocorrelation
functions of the residuals provides not only a test of
whether or not the residuals are uncorrelated, but, if they
are correlated, the plots suggest modifications to the model.
For example, suppose the model was tentatively specified as
the ARMA(1,1) model: $(1 - 0.5B)(y_t - 2) = (1 + 0.7B)\, a_t$ and the
autocorrelations and partial autocorrelations of the residuals
suggest the model

$$(1 - 0.3B)\, a_t = u_t$$

where the u's are white noise (uncorrelated with variance
$\sigma_u^2$). Then, the next model entertained for the original
time series should be the ARMA(2,1) model:

$$(1 - 0.3B)(1 - 0.5B)(y_t - 2) = (1 + 0.7B)\, u_t .$$
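
The ARMA(2,1) form follows from applying the operator (1 - 0.3B) to both sides of the tentative model and replacing (1 - 0.3B)a_t by u_t. Multiplying out the two autoregressive factors shows the second-order coefficients explicitly; this worked expansion is added here as an illustration:

```latex
\begin{aligned}
(1 - 0.3B)(1 - 0.5B) &= 1 - 0.8B + 0.15B^2 , \\
y_t - 2 &= 0.8\,(y_{t-1} - 2) - 0.15\,(y_{t-2} - 2) + u_t + 0.7\,u_{t-1} .
\end{aligned}
```

In the notation of equation (1), the revised model therefore has autoregressive coefficients 0.8 and -0.15 and moving average coefficient -0.7.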

G. FORECASTING 

The objective of forecasting is to predict future values
with as little error as possible. The criterion most often
used for selection of the best forecast is that forecast
which has minimum mean square forecast error. Thus, if
$\hat{y}_T(\ell)$ represents the forecast at origin T of the value
$y_{T+\ell}$, the objective is to select $\hat{y}_T(\ell)$ so that

$$E[(y_{T+\ell} - \hat{y}_T(\ell))^2]$$

is minimized. This forecast is given by taking $\hat{y}_T(\ell)$ as the
conditional expectation of $y_{T+\ell}$:

$$\hat{y}_T(\ell) = E[y_{T+\ell} \mid y_T, y_{T-1}, \ldots, y_1] \qquad (7)$$

The forecasts can be easily generated recursively from the
mathematical model utilizing the fact that

$$E[y_{T-j} \mid y_T, y_{T-1}, \ldots, y_1] = y_{T-j} \quad \text{for } j = 0, 1, 2, \ldots \ (T \text{ is current time})$$

and

$$E[a_t \mid y_T, y_{T-1}, \ldots, y_1] = \begin{cases} 0 & \text{if } t > T \\ a_t & \text{if } t \le T \end{cases}$$



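For example, suppose the estimated model is:

$$(1 - 0.5B + 0.6B^2)\, y_t = (1 + 0.3B)\, a_t$$

or, equivalently,

$$y_t = 0.5 y_{t-1} - 0.6 y_{t-2} + a_t + 0.3 a_{t-1} .$$

Let $y_{100} = 1.4$, $y_{99} = 1.0$, and $a_{100} = 0.2$. The forecasts
made at time t = 100 are found as follows:

$$\begin{aligned}
\hat{y}_{100}(1) &= E[y_{101} \mid y_{100}, \ldots, y_1] \\
                 &= E[0.5 y_{100} - 0.6 y_{99} + a_{101} + 0.3 a_{100}] \\
                 &= 0.5 y_{100} - 0.6 y_{99} + 0.3 a_{100} \\
                 &= 0.16 \\
\hat{y}_{100}(2) &= E[y_{102} \mid y_{100}, \ldots, y_1] \\
                 &= E[0.5 y_{101} - 0.6 y_{100} + a_{102} + 0.3 a_{101}] \\
                 &= 0.5 \hat{y}_{100}(1) - 0.6 y_{100} \\
                 &= -0.76 \\
\hat{y}_{100}(3) &= E[0.5 y_{102} - 0.6 y_{101} + a_{103} + 0.3 a_{102}] \\
                 &= 0.5 \hat{y}_{100}(2) - 0.6 \hat{y}_{100}(1) \\
                 &= -0.48
\end{aligned}$$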
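
The recursion in this example can be sketched in Python. The function below is an illustration with hypothetical names, written for a model with no constant term; it uses the example's model, y_t = 0.5y_{t-1} - 0.6y_{t-2} + a_t + 0.3a_{t-1}, with y_100 = 1.4, y_99 = 1.0 and a_100 = 0.2, values that are consistent with the forecasts 0.16 and -0.76 quoted in the example.

```python
def forecast(y_hist, a_hist, phi, theta, steps):
    """Minimum mean square error forecasts for the zero-constant model
    y_t = sum(phi_i * y_{t-i}) + a_t - sum(theta_j * a_{t-j}).

    y_hist and a_hist hold the observations and residuals up to the
    forecast origin, oldest first.  Future shocks are replaced by
    their conditional expectation (zero) and future observations by
    their own forecasts.
    """
    y = list(y_hist)
    a = list(a_hist)
    out = []
    for _ in range(steps):
        value = sum(p * y[-i] for i, p in enumerate(phi, start=1))
        value -= sum(q * a[-j] for j, q in enumerate(theta, start=1))
        y.append(value)     # forecast stands in for the future observation
        a.append(0.0)       # expected value of a future shock
        out.append(value)
    return out


# (1 - 0.5B + 0.6B^2) y_t = (1 + 0.3B) a_t  means  phi = (0.5, -0.6), theta = (-0.3,)
predictions = forecast([1.0, 1.4], [0.2], [0.5, -0.6], [-0.3], 3)
print([round(v, 3) for v in predictions])
```

The three values are approximately 0.16, -0.76 and -0.476; the last is the -0.48 of the example before rounding to two decimals.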



Let $e_T(\ell) = y_{T+\ell} - \hat{y}_T(\ell)$ be the forecast error
$\ell$ periods ahead. It can be shown that $e_T(\ell)$ is given by

$$e_T(\ell) = a_{T+\ell} + \psi_1 a_{T+\ell-1} + \cdots + \psi_{\ell-1} a_{T+1} \qquad (8)$$

where the weights $\psi_j$ are determined from

$$\psi(B) = \phi^{-1}(B)\, (1 - B)^{-d}\, \theta(B) .$$

The variance of the forecast error is given by

$$E[e_T^2(\ell)] = (1 + \psi_1^2 + \cdots + \psi_{\ell-1}^2)\, \sigma_a^2 \qquad (9)$$

From this, a confidence interval of z standard deviations
around a forecast $\ell$ periods ahead would be given by

$$\hat{y}_T(\ell) \pm z \left( 1 + \sum_{j=1}^{\ell-1} \psi_j^2 \right)^{1/2} \sigma_a \qquad (10)$$

Note from expression (8) that the one-step ahead forecast
error, $e_T(1)$, is simply $a_{T+1}$; i.e.

$$y_{T+1} - \hat{y}_T(1) = a_{T+1} .$$

This explains the common use of the word residual to refer
to the random shock terms. Also, from expressions (9) and
(10) it is clear that the forecast error variance is a
nondecreasing function of the length of the forecast period $\ell$.
Thus, the confidence bands must get wider as the forecast
period gets larger.
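
The weights in (8) can be obtained by matching powers of B in phi(B) psi(B) = theta(B). The Python sketch below is an illustration with hypothetical function names; it assumes the autoregressive coefficients passed in are those of the undifferenced form, so that any factor (1 - B)^d has already been multiplied into them.

```python
import math


def psi_weights(phi, theta, count):
    """Return psi_0, ..., psi_{count-1}, where psi_0 = 1 and
    psi_j = sum(phi_i * psi_{j-i}) - theta_j  (theta_j = 0 beyond order q).
    """
    psi = [1.0]
    for j in range(1, count):
        value = -theta[j - 1] if j <= len(theta) else 0.0
        for i in range(1, min(j, len(phi)) + 1):
            value += phi[i - 1] * psi[j - i]
        psi.append(value)
    return psi


def forecast_std_error(phi, theta, sigma_a, lead):
    """Standard deviation of the forecast error 'lead' periods ahead,
    the square root of expression (9)."""
    psi = psi_weights(phi, theta, lead)
    return sigma_a * math.sqrt(sum(w * w for w in psi))


# Undifferenced operator 1 - B (a first difference with no other terms):
# every weight equals one, so the error variance grows with the lead time.
for lead in (1, 2, 3, 4):
    print(lead, forecast_std_error([1.0], [], 1.0, lead))
```

The printed standard errors increase with the lead time, which is the widening of the confidence bands described above.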
III. DESCRIPTION OF THE TIME SERIES EDITOR

A. THE EXECUTIVE PROGRAM

The Time Series Editor contains a master program, called
TIMESER EXEC, that provides file control of all of the other
programs, controls input and output, takes care of the
necessary CP/CMS protocol and provides instructions to the user
as to what is in the Time Series Editor and how each program
can be used. TIMESER is written in a special CP/CMS Exec
language. It is the only program in the Editor that is not
written in FORTRAN, and, consequently, it should be the only
program that would need modification if the Editor were to
be adapted to another time-sharing system.

After the user has logged into CP/CMS and linked to the
files containing the Time Series Editor (a user's guide for
this is given in Appendix A) the entire Editor package is
made available to the user by his command, TIMESER EXEC.[1]
On entry of this command, the EXEC provides a guided tour
through the Editor. It tells the user what the Editor can
provide; it asks the user what tasks he wants to do; and, on
the basis of the user's answers, it instructs the user as to
what data is required and how it must be entered. When the
user selects an option for execution, the EXEC loads the
appropriate program(s) and automatically manipulates any
required input and output files.

[1] There is a second executive routine, called TS EXEC,
that can be used by the more experienced analysts who wish
to suppress some of the user instructions. This abbreviated
program provides the same basic services as TIMESER EXEC.

B. DATA ENTRY AND PROGRAM OUTPUT

Whenever data is required, the user is prompted by either
the EXEC or the program module being executed. In most cases,
the necessary user response is a single alphanumeric character
input during execution by keyboard. However, in some
cases, the amount of data required is too bulky for keyboard
entry, and the data is entered more efficiently offline via
cards or tapes. Similarly, most of the output is typed out
right at the user's terminal, but, in some cases, the output
is printed offline for conservation of time and to provide
a hard copy of the results.

Detailed descriptions are given of the input and output 
requirements of each program in the individual program write- 
ups. However, there are some general principles of data 
input that apply for all programs. These are described in 
this section. 

1. Offline Card Input

When a large volume of data, such as a time series 
of a hundred or more observations, is required, the data can 
be entered more efficiently through mechanisms other than 
keyboard entry. Keyboard entry would be not only much 
slower, but also more likely to contain errors than other in- 
put mediums. Thus, the Time Series Editor requires that the 
time series data be entered offline via cards. The data are 
read offline and stored in the user's file FT02F001 which is 
read automatically when required by the Time Series Editor. 

An example of an input data deck for use in the Editor is 
shown in Figure 2. 
Figure 2. Card deck for offline read.

2. Numeric Keyboard Input

When the user is prompted to enter numerical values 
such as the number of observations, the number of parameters, 
parameter estimates or starting conditions, he must enter 
those values before program execution can continue. The user 
should enter the data according to standard FORTRAN practice. 
That is, integer data should be entered (without decimal) for 
counts and names beginning with the letters I through N; float- 
ing data should be entered with decimal point for all other 
variables. Because a typed decimal point overrides a float- 
ing format it is not necessary for the user to concern himself 
with the format for floating data (the user is never asked to 
enter more than a single observation on a line) . However, 
some care must be exercised when entering integer data because 
integer data must be right justified in its format field. 

The user is told whenever the integer format is anything other
than I1. For example, suppose the user wants to analyze a
time series having only 65 observations. If the program that 
he is executing requires the length of the time series, the 
user will receive the following request: 

ENTER LENGTH OF THE TIME SERIES, L, VIA I3. 

The user should then enter: 
col. 123 

b65 (b represents a blank space) 

If the blank were omitted, the program would read the length 
as 650 and many problems would occur. 
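
The effect of omitting the blank can be imitated in Python. This sketch is an illustration only: the function name is hypothetical, and it assumes, as the 650 in the text implies, that blanks in a fixed-width integer field are read as zeros.

```python
def read_integer_field(typed, width):
    """Imitate a fixed-width integer read in which blanks count as zeros.

    A line shorter than the field is padded on the right with blanks,
    just as a short typed line leaves the rest of the field blank.
    """
    field = typed[:width].ljust(width)
    return int(field.replace(" ", "0"))


print(read_integer_field(" 65", 3))   # right justified in an I3 field
print(read_integer_field("65", 3))    # leading blank omitted
```

The first call returns 65 and the second returns 650, which is the misreading the text warns about.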
3. Alphabetic Keyboard Input



The Time Series Editor often requests the user to 
respond with alphabetic input. For example, it may ask him 
a question that requires a yes or no answer, or it may ask 
him what option he wants to execute. The editor has been 
programmed to read only the first letter of the user's re- 
sponse. Thus, he need only enter a single letter for each 
such inquiry. For example, he should enter Y for yes, N for 
no, P for the PLOT option, E for the ESTIMATE option, etc. 

4. Output

Most of the results are written out right at the
user's terminal. In some cases, however, the output is
printed offline to conserve time. Such results as plots and
transformed time series are written onto various files
(FT03F001, FT08F001, FT01F001) and the EXEC program prints
them offline under the user's identification number. The
files can also be printed out at the user's terminal at his
request.

C. PLOT PROGRAM 

The PLOT program plots any given time series which re- 
sides in file FT02F001. Other than the time series, which 
is entered offline, the program requires only that an identi- 
fication title for the plot be entered by the user during 
execution. The plot is automatically printed offline and is 
also found in file FT08F001. The PLOT program uses the
subroutine PLOTP in the IBM Scientific Subroutine Package
Library (SSPLIB).
D. DIFF PROGRAM

The DIFF program performs seasonal and/or nonseasonal
differences and it allows the user to transform a given time
series using logarithmic or exponential transformations. The
program requires the given time series to reside in file
FT02F001, and it requires the user to input the following
information during program execution:

(1) Type of time series (seasonal, nonseasonal)

(2) Order of differencing (for stationarity)

(3) Length of seasonal period (for seasonal series only)

(4) Type of transformation (none, exponential,
logarithmic)

(5) Transform parameters (if log or exponential transform
is desired)

(6) Yes or no responses to questions about plotting of
autocorrelations and partial autocorrelations of
transformed/differenced series.

The transformed/differenced series is written out into file
FT03F001 and can be printed out at the user's terminal if
he requests a printout. Plots of the transformed/differenced
series, its autocorrelations, and its partial autocorrelations
can be printed offline.

This program is used to transform a given series that may
be seasonal and/or nonstationary or nonhomogeneous into a
stationary series of the type possible for analysis using
the Box-Jenkins methodology. The program utilizes the IMSL
subroutine FTRDIF.

E. AUTO PROGRAM

AUTO calculates summary statistics from a given time
series. The summary statistics include the sample mean,
variance, autocorrelations, and partial autocorrelations of
lags one through 20. All of the statistics are printed out
at the user's terminal, and, in addition, plots of the
autocorrelations and partial autocorrelations are printed out
offline. Furthermore, the plots reside in the user's file
FT08F001 so that they may also be printed at the user's
terminal. That is, however, a very time consuming process.

AUTO provides the major information about stationarity,
seasonality and model identification. With the Box-Jenkins
procedure, the second moments (autocorrelations and partial
autocorrelations) are the primary tools for tentative model
identification. In addition, AUTO is used in the diagnostic
checkout phase of model building to test the residuals.

The time series (original, transformed or differenced, 
or residuals) must reside in file FT02F001. (When a series 
is transformed or the residuals are calculated, those values 
are automatically stored in file FT02F001 temporarily for 
further analysis.) The program uses the IMSL subroutine 
FTAUTO. 
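
The partial autocorrelations that AUTO reports can be obtained from the autocorrelations alone. The Python sketch below uses the Durbin-Levinson recursion, a standard method; how FTAUTO itself computes them is not described in this report, so the choice of method and the function name are assumptions.

```python
def partial_autocorrelations(rho):
    """Partial autocorrelations of lags 1..K from autocorrelations
    rho[0] = r_1, ..., rho[K-1] = r_K, by the Durbin-Levinson recursion.
    """
    pacf = []
    prev = []                          # coefficients of the order k-1 fit
    for k in range(1, len(rho) + 1):
        num = rho[k - 1] - sum(prev[j] * rho[k - 2 - j] for j in range(k - 1))
        den = 1.0 - sum(prev[j] * rho[j] for j in range(k - 1))
        phi_kk = num / den             # partial autocorrelation at lag k
        prev = [prev[j] - phi_kk * prev[k - 2 - j]
                for j in range(k - 1)] + [phi_kk]
        pacf.append(phi_kk)
    return pacf


# Theoretical autocorrelations of a first-order autoregressive process
# with coefficient 0.5 are 0.5, 0.25, 0.125, ...
print(partial_autocorrelations([0.5, 0.25, 0.125]))
```

For this input only the lag-1 partial autocorrelation is nonzero, the cut-off pattern that identifies a first-order autoregressive model.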

F. ESTIMATE PROGRAM 

After the user has tentatively identified a model, the 
Editor program ESTIMATE should be executed to calculate maxi- 
mum likelihood estimates of the model parameters. It esti- 
mates the autoregressive parameters, the moving average 
parameters, the constant term and the variance of the shock 
terms. It also determines the residuals which are so impor- 
tant for the diagnostic checkout phase of model building. 






ESTIMATE uses the IMSL subroutine FTMAXL to estimate the 
parameters. 

ESTIMATE needs the time series in file FT02F001 prior to 
execution. During program execution the user will be prompted 
with the following requests: 

1) Enter number of autoregressive (AR) parameters. 

2) Enter number of moving average (MA) parameters. 

3) Enter the number of differences. 

ESTIMATE then provides the following output: 

1) Estimated AR parameters for the undifferenced form 
of the model. 

2) Estimated MA parameters. 

3) Estimated overall MA constant and white noise variance. 

4) Auto- and partial correlations of residuals. 

5) Plots of (4). 

6) Values of residuals printed out offline and residuals 
stored in file FT03F001. 

7) Chi-square goodness of fit value for estimated model. 
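
The text does not state which chi-square statistic ESTIMATE reports. The degrees of freedom quoted in the example chapter (19 for a model with one ARMA parameter, 18 for a model with two) are consistent with 20 residual autocorrelations less the number of estimated ARMA parameters, so the sketch below assumes the Box-Pierce portmanteau statistic. The function name and arguments are hypothetical.

```python
def box_pierce(residual_acf, n_obs, n_params):
    """Box-Pierce statistic Q = n * sum(r_k**2) and its degrees of freedom.

    residual_acf : autocorrelations r_1..r_K of the residuals
    n_obs        : number of residuals used to compute them
    n_params     : number of estimated AR plus MA parameters
    """
    q = n_obs * sum(r * r for r in residual_acf)
    df = len(residual_acf) - n_params
    return q, df


q, df = box_pierce([0.1, -0.2], n_obs=100, n_params=1)
```

A large value of Q relative to a chi-square distribution with `df` degrees of freedom indicates that the residuals are not behaving like white noise.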

G. FORECAST PROGRAM 

The Editor program FORECAST uses the estimated 
mathematical model to compute forecasts of future values of the 
time series. It also computes (1-α)100% probability limits 
for the forecasted values. The program utilizes the IMSL 
subroutine FTCAST. 

The time series must reside in file FT02F001 before exe- 
cution begins. During execution, the user is required to 
enter the following inputs: 

1) Origin of forecasts 

2) Number of AR parameters 

3) Estimated values of AR parameters 

4) Number of MA parameters 

5) Estimated values of MA parameters 

6) Overall MA constant and white noise variance 

7) Maximum lead time for a forecast 

8) Order of differencing in model 

9) Significance level for forecast confidence limits 

The program then provides the following output: 

1) AR parameters in undifferenced form² 

2) Forecasts for lead times l = 1, 2, ..., max 

3) Deviations from each forecast for the (1-α)100% 
confidence limits 

4) Plots of forecasts and corresponding deviations joined 
with the original time series. This is plotted 
offline. 
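
The forecast computation can be sketched for the simplest case. This is not the FTCAST routine: it handles only models without moving average terms, written in undifferenced AR form, and the function name and argument list are assumptions. Deviations are computed from the psi weights of the model and a normal quantile.

```python
from statistics import NormalDist


def forecast_ar(history, ar, const, noise_var, max_lead, alpha):
    """Forecasts and (1 - alpha) deviations for a pure AR-form model.

    ar = [phi*_1, ..., phi*_p] are the undifferenced AR coefficients
    (p >= 1), so y_t = const + sum(ar[i] * y_{t-1-i}) + a_t.
    """
    p = len(ar)
    path = list(history[-p:])   # last p observed values at the origin
    psi = [1.0]                 # psi_0 = 1
    z = NormalDist().inv_cdf(1.0 - alpha / 2.0)
    forecasts, deviations = [], []
    cum = 0.0                   # running sum of squared psi weights
    for lead in range(1, max_lead + 1):
        nxt = const + sum(ar[i] * path[-1 - i] for i in range(p))
        path.append(nxt)
        forecasts.append(nxt)
        cum += psi[lead - 1] ** 2
        deviations.append(z * (noise_var * cum) ** 0.5)
        psi.append(sum(ar[i] * psi[lead - 1 - i] for i in range(min(p, lead))))
    return forecasts, deviations


# ARIMA(1,1,0) with phi_1 = 0.5 has undifferenced AR form [1.5, -0.5].
f, dev = forecast_ar([1.0, 2.0], [1.5, -0.5], 0.0, 1.0, max_lead=2, alpha=0.2)
```

The limits for lead time l are the forecast plus and minus the corresponding deviation. Models with moving average terms would also need the past residuals, which this sketch omits.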

H. SIMULATE AND GENERATE PROGRAMS 

Because of their similarities the Editor programs SIMULATE 
and GENERATE are described together. GENERATE allows the user 
to generate a time series from any ARIMA model that he 
specifies. The user must identify the model and give values 
for its parameters and starting conditions. The program takes 
the specified model, generates random noise terms, and 
calculates as many values of the time series as desired. 



² Suppose the identified model was ARIMA(1,1,0). The 
differenced form is $(1-\phi_1 B)\nabla y_t = \theta_0 + a_t$. The undifferenced 
form of the autoregressive operator is $1-(1+\phi_1)B+\phi_1 B^2$, found by 
multiplying $(1-\phi_1 B)$ by $(1-B)$. 

This program is useful for classroom instruction, where it 
can generate a wide variety of time series examples 
for model identification. It is also useful for the 
diagnostic phase of model checkout. A time series can be 
generated from the estimated model and its properties 
compared with those of the original series. If large 
discrepancies occur, the estimated model may be inadequate. 

The program requires the user to enter the following in- 
puts during execution: 

1) A random number seed 

2) Number of AR parameters (undifferenced form) 

3) Number of MA parameters 

4) Length of time series 

5) White noise variance 

6) Values of AR and MA parameters 

7) Initial starting values 

The program output is the generated time series. 
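
A minimal sketch of the generation step, using the same inputs as the list above, is given here. It is not the Editor's GENERATE program: the function name, the optional constant term, the assumption of normally distributed shocks, the treatment of pre-sample shocks as zero, and the convention that the requested length includes the starting values are all assumptions.

```python
import random


def generate_series(seed, ar, ma, length, noise_var, start_values, const=0.0):
    """Generate y_t = const + sum(ar_i * y_{t-i}) + a_t - sum(ma_j * a_{t-j}).

    ar holds the AR coefficients in undifferenced form; start_values must
    supply at least len(ar) values.
    """
    rng = random.Random(seed)
    sigma = noise_var ** 0.5
    y = list(start_values)
    shocks = [0.0] * max(len(y), len(ma))   # pre-sample shocks assumed zero
    while len(y) < length:
        a = rng.gauss(0.0, sigma)
        value = const + a
        value += sum(phi * y[-1 - i] for i, phi in enumerate(ar))
        value -= sum(theta * shocks[-1 - j] for j, theta in enumerate(ma))
        y.append(value)
        shocks.append(a)
    return y


# ARIMA(1,1,0) with phi_1 = 0.5, i.e. undifferenced AR form [1.5, -0.5].
series = generate_series(seed=1, ar=[1.5, -0.5], ma=[], length=50,
                         noise_var=0.02, start_values=[20.0, 20.1])
```

Calling the function repeatedly with different seeds gives several realizations of the same model, which is the idea behind the SIMULATE program described next.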

The SIMULATE program provides the capability of 
generating any number of simulated time series. This is useful for 
predicting what might happen in the future and for demonstrating 
that, even within a given model, the actual observed time 
series can differ substantially. This program uses GENERATE 
but also requires as input the number of simulated series the 
user wishes to generate. Furthermore, the SIMULATE program 
allows the user to select values of the original time series 
as starting values for the simulated series. Output consists 
of the simulated series and plots. 

This completes the descriptions of the programs contained 
in the Time Series Editor. The actual program listings are 
given in Appendix C and sample user sessions are shown in 
Appendix B. In the next chapter, an example is given of an 
entire time series analysis from plotting to diagnostic test- 
ing and forecasting. 

IV. EXAMPLE TIME SERIES ANALYSIS 



In this chapter a description is given of a complete 
analysis of a time series using the Box-Jenkins procedure 
and the Time Series Editor. The time series analyzed is 
series C (Chemical Process Temperature Readings) from Box 
and Jenkins [Ref. 2, p. 528]. This time series was selected 
because it is analyzed completely in Ref. 2 so that the 
user can compare the results found there with the results 
given by the Time Series Editor. 

The first step in the analysis of series C is to plot 
the time series. This is shown in Figure 3. The plot 
reveals rather wide fluctuations in the series but not the 
sort of explosive nonstationary behavior that would render 
a modeling attempt fruitless. The plot also reveals that 
the time series has a large amount of momentum (movements 
of the series tend to resist changes of direction) . This 
is characteristic of ARIMA models with one or two differences. 

Second, the autocorrelations, partial autocorrelations, 
mean, and variance of the series were estimated using AUTO. 

The numerical values are printed out in Table I and plots 
of the autocorrelations and partial autocorrelations are 
given in Figures 4 and 5. Figure 4 shows that the autocor- 
relations dampen out slowly in a near linear fashion. This 
is an indication that the series is nonstationary and that 
one or more differences are needed to make it stationary. 

The partial autocorrelation plot is not informative when 
the autocorrelations fail to dampen out rapidly. 

As suggested by the plots of the original series and its 
autocorrelations, the program DIFF was executed to transform 
the series into the series $\{w_t\}$, where 
$w_t = \nabla y_t = y_t - y_{t-1}$. The values of the series 
$\{w_t\}$ are tabulated in Table II. The autocorrelations and 
the partial autocorrelations of the differenced series were 
then calculated and plotted. The values are given in Table III 
and the correlation plots are shown in Figures 6 and 7. The 
correlation plots suggest that either the first differenced 
series is ARMA(1,0) with the AR parameter near unity (the 
autocorrelations of the first differences also dampen out 
slowly) or a second difference is required. Thus, two 
candidates are suggested: 

1) ARIMA(1,1,0): $(1-\phi_1 B)(1-B)\,y_t = \theta_0 + a_t$ 

and 2) ARIMA(0,2,0): $(1-B)^2\,y_t = \theta_0 + a_t$. 

For purposes of estimation, the second model was extended to 
include two moving average terms. Such "overfitting" is 
often done to see if the estimated moving average parameters 
turn out to be near zero, thus confirming the tentative 
identification. Thus, the two models entertained were: 

1) $(1-\phi_1 B)\nabla y_t = \theta_0 + a_t$    ARIMA(1,1,0) 

and 2) $\nabla^2 y_t = \theta_0 + (1-\theta_1 B-\theta_2 B^2)\,a_t$    ARIMA(0,2,2) 

The next step is to calculate maximum likelihood 
estimates of the model parameters. The program ESTIMATE was 
used to do this. The estimated parameters for the 
ARIMA(1,1,0) model are: 

$\phi_1 = 0.8131$ 

$\theta_0 = 0.0$ 

$\sigma_a^2 = 0.0178396$ 

The autocorrelations and partial autocorrelations of the 
residuals were calculated and plotted to test the model. 
The correlations should appear to be estimates of a pure 
white noise process if the model is adequate. The model 
passed that test. A goodness-of-fit test was also performed 
to test the model. The chi-square value was $\chi^2 = 21.51$ with 
19 df and a significance level of 0.3082. Thus, there is 
no strong evidence to suggest that the (1,1,0) model: 

$(1 - 1.8131B + 0.8131B^2)\,y_t = a_t$    (11) 

is inadequate. 
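
The coefficients in equation (11) follow from multiplying the autoregressive operator by the differencing operator. A short check, using a hypothetical helper that multiplies two polynomials in B stored as coefficient lists, is:

```python
def multiply_operators(a, b):
    """Multiply two polynomials in B given as lists [c0, c1, c2, ...]."""
    out = [0.0] * (len(a) + len(b) - 1)
    for i, ai in enumerate(a):
        for j, bj in enumerate(b):
            out[i + j] += ai * bj
    return out


phi1 = 0.8131
# (1 - phi1*B) * (1 - B): coefficients of 1, B and B**2
undifferenced = multiply_operators([1.0, -phi1], [1.0, -1.0])
# approximately [1.0, -1.8131, 0.8131]
```

The same helper applied to $(1-B)$ twice gives the second-difference operator $1 - 2B + B^2$ used in the ARIMA(0,2,2) candidate.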

Parameter estimates were also obtained for the 
ARIMA(0,2,2) model using ESTIMATE. The parameter estimates were: 

$\theta_0 = 0.0$ 

$\theta_1 = 0.1382$ 

$\theta_2 = 0.1300$ 

$\sigma_a^2 = 0.0189515$ 

As before, the correlation plots of the residuals fail to 
suggest any inadequacy of the model. However, the 
chi-square lack-of-fit test yielded $\chi^2 = 28.74$ with 18 df 
and a significance level of 0.0516. 

Thus, if the ARIMA(0,2,2) model were the correct model, a 
chi-square value as large as 28.74 would occur by chance 
with a probability of only 0.0516. There is some ground 
here for questioning the ARIMA(0,2,2) model. Because of its 
simplicity (parsimony is a very desirable feature for all 
models) and its better fit, the ARIMA(1,1,0) model, equation 
(11), was selected. 
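
The significance levels quoted for the two chi-square values can be checked independently. The sketch below assumes they are upper-tail probabilities of the chi-square distribution. It evaluates the regularized upper incomplete gamma function by a standard recurrence using only the Python standard library; the function name is hypothetical.

```python
import math


def chi2_upper_tail(stat, df):
    """P(chi-square with integer df >= stat), for stat > 0.

    Uses Q(a + 1, x) = Q(a, x) + x**a * exp(-x) / Gamma(a + 1), starting
    from Q(1, x) = exp(-x) for even df or Q(1/2, x) = erfc(sqrt(x)) for odd.
    """
    x = stat / 2.0
    if df % 2 == 0:
        a, q = 1.0, math.exp(-x)
    else:
        a, q = 0.5, math.erfc(math.sqrt(x))
    for _ in range((df - 1) // 2):
        q += math.exp(a * math.log(x) - x - math.lgamma(a + 1.0))
        a += 1.0
    return q


p_110 = chi2_upper_tail(21.51, 19)   # ARIMA(1,1,0) residual test
p_022 = chi2_upper_tail(28.74, 18)   # ARIMA(0,2,2) residual test
```

Under that assumption the two results should agree closely with the reported levels of 0.3082 and 0.0516.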

The estimated model, equation (11), for series C was 
used to calculate forecasts and confidence limits for those 
forecasts. The forecasts were made for 13 periods into the 
future at origin t=20 using the observed values of the 
original time series as starting conditions. Figure 8 shows 
a plot of the forecasted values adjoined to the original 
series with 80% confidence limits about those forecasts. 

The forecasted values and the probability deviations are 
tabulated on page 59. The plot shows that the confidence 
limits are very wide for lead times far into the future. 

Finally, program SIMULATE was utilized to generate two 
simulated series from eq. (11). Values of these two simula- 
tions are given in Table IV. Plots of those simulated 
series are shown in Figure 9. The general shape of those 
curves is like that of the original time series, thus con- 
firming the estimated model. 

The keyboard printout of the user session that generated 
the analysis above is included in the appendices. The entire 
session lasted approximately one and a half clock hours, 
including checkout of plots, and consumed less than one minute 
of CPU time. 

SUMMARY AND RECOMMENDATIONS 



There is a growing need for an efficient unified collec- 
tion of computer programs for aiding operations researchers, 
statisticians, economists, managers, and other people who 
must analyze time series. The Box-Jenkins methodology has 
opened the doors of time series analysis to an expanding popu- 
lation of analysts. Many computational algorithms have been 
developed and are available in many forms. The problem has 
been the iterative nature of the Box-Jenkins model building 
procedure. Such a procedure is straightforward but can be 
very time consuming. The Time Series Editor that has been 
described in this report provides a unified collection of 
computer programs in an interactive time-sharing mode that 
can aid the user in all phases of the model building procedure 
from the plotting of the time series to the forecasting of 
future values. The Editor does not develop new computational 
algorithms. Rather, it makes those that are available easier 
and faster to use. With its simple input requirements, which 
are all prompted by written instructions, the Editor can 
easily be used by the most naive computer user. 

The Box-Jenkins methodology has been described in Chapter 
2 and descriptions of the modules of the Time Series Editor 
have been given in Chapter 3. These descriptions were given 
not as a substitute for study of the Box-Jenkins technique, 
but as a communication device to explain to the potential 
user of the Time Series Editor what happens in the various 
stages of the computational process, and to explain what the 
Editor requests mean. An example time series analysis 
covering all stages of the Box-Jenkins model building procedure 
was presented in Chapter 4. Appendices contain a User's Guide 
to the Time Series Editor, sample user sessions and program 
output, and program listings. 

Although the Editor covers the entire model building pro- 
cess, as described by Box and Jenkins, from the plotting of 
the series to diagnostic testing and forecasting, much more 
could be added to improve the Editor's capability and utility. 
Listed below are several options that are recommended for 
addition to the Time Series Editor. 

1. Extend the diagnostic ability to include a periodo- 
gram analysis or other tests related to the spectral 
analysis of time series. 

2. Include an option that will determine all roots of 
the characteristic equation and give the general 
solution to the autocorrelation function and to the 
eventual forecast function. 

3. Modify the FORECAST option to allow forecasts for 
seasonal nonstationary series. 

4. Modify the ESTIMATE option to provide parameter 
estimates for seasonal stationary series. 

5. Expand the univariate model building capability to 
linear transfer function model building. 

6. Expand the Time Series Editor to include multivariate 
models such as multiple regression. 

APPENDIX A: USER'S GUIDE TO TIME SERIES EDITOR 



In order to use the Time Series Editor, the user must 
log onto CP, link to the disk storage area where the Time 
Series Editor resides, implement CMS, log into the general 
user and Time Series Editor disk areas, and enter the EXEC 
routine. It is also necessary to log out of the system at 
the completion of execution. The material which follows 
will enable the user to perform the above steps on the NPS 
CP/CMS system. Commands marked with an asterisk (*) are 
entered by the user (the asterisk itself is omitted). Those 
without an asterisk and those in all capital letters are 
system responses. Numbered sentences are comments which 
will not appear during an actual user session. The instruc- 
tions and system responses assume an IBM 2741 Input/Output 
Terminal. Some modifications may be necessary if other 
terminals are used. 

1. Turn the terminal on, depress the RETURN key, and wait 
for the system to respond: 

CP-67 online xd.65 qsyosu 

2. Depress the ATTN key. The roll bar will advance and 
the keyboard will unlock. Then enter: 

*login xxxxgnn 

3. xxxx is the user's identification number, and nn is 
the terminal number (written on the right-hand side 
of the terminal). For example, if the user's ID is 0655 
and the terminal number is 07 the input would be: 

login 0655g07 

4. The system will respond with the statement: 

ENTER PASSWORD: 

5. The user then enters his password (most users at NPS 
have the password npg): 

*npg 

6. The system will then respond: 

ENTER 4-DIGIT PROJECT NUMBER FOLLOWED BY 4-CHARACTER 
COST CENTER CODE: 

7. The user then enters: 

*aaaabbbb 

8. aaaa is the assigned project number and bbbb is the 
user's section designator or faculty code. 

9. The system will respond with the message of the day 
such as: 

CP/CMS HRS..0930-2200(MON-THURS)..0930-1800(FRI) OUTPUT 
RETAINED 4 DAYS 
CMS VERSION 3.2 

10. At this point the user is in CMS. He must then get 
into CP. This is done by hitting the ATTN key. The 
system will respond: 

CP 

11. The user must then link to the Time Series Editor. This 
is accomplished by entering: 

*link 1969p 191 192 

12. The system will respond with the instruction: 

ENTER PASSWORD: 

13. The user then enters: 

*rfrr 

14. The system will respond: 

SET TO READ ONLY 

15. The user must now implement CMS by entering: 

*ipl cms 

16. The system will respond: 

CMS VERSION 3.2 

17. Now the user must log into both his general user and 
the Time Series Editor area by entering: 

*login 191 

18. The system will respond with a message about the status 
of the file such as: 

P(191):49 FILES; 225 REC IN USE, 71 LEFT (OF 296), 

76% FULL (2 CYL) 

R; 

19. The user should then enter the command: 

*login 192 t,p 
R; 

20. The system will respond: 

T(192) R/O 

R; 

21. The user can then enter the Time Series Editor by 
issuing the command: 

*timeser exec 

22. The system will respond: 

YOU HAVE ENTERED THE TIME SERIES EDITOR 
PLEASE RESPOND TO EACH INQUIRY 
etc. 

23. The user is then on his own, guided by the Exec routine. 
See the notes that appear at the end of this Appendix 
for additional information. Eventually, the user will 
be asked: 

DO YOU WANT TO TRY AGAIN? 

24. If a yes response is given another sequence will begin. 

A no will take the user out of the Time Series Editor. 
The system response will be: 

CONTROL RETURNED TO CMS 
R; 

25. The user then can log out by entering: 

*cp logout 

26. The system will respond: 

CONNECT= 01.17.24 VIRTCPU= 001.03.37 TOTCPU= 001.30.14 
LOGOUT AT 17.19.08 ON 3/21/77 

27. The user should then turn off his terminal and tear off 
his output. 

A sample of the procedure is shown below: 

cp-67 online xd.65 qsyosu 

login 0655g07 
ENTER PASSWORD: 
npg 

ENTER 4-DIGIT PROJECT NUMBER FOLLOWED BY 4-CHARACTER COST 
CENTER CODE: 

0986rk54 

CP/CMS HRS, MONDAY THRU THURSDAY, 0930-2200, FRIDAY, 0930-1800. 

OUTPUT RETAINED 5 DAYS 

READY AT 16.13.22 ON 03/22/77 

CMS VERSION 3.2 

offline read * 

P (191): 49 FILES; 225 REC IN USE, 71 LEFT (OF 296), 76% FULL 

(2 CYL) 

OFFLINE READ FILE FT02F001 
R; 

CP 

link 1969p 191 192 

ENTER PASSWORD: 
rfrr 

SET TO READ ONLY 
ipl cms 

CMS VERSION 3.2 
login 191 

P (191): 49 FILES; 225 REC IN USE, 71 LEFT (OF 296), 76% FULL 

(2 CYL) 

R; 

login 192 t,p 
T (192) R/O 

R; 

timeser exec 

YOU HAVE ENTERED THE TIME SERIES EDITOR 



DO YOU WANT TO TRY AGAIN? 
n 

CONTROL RETURNED TO CMS 